 straight-through estimator




Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks

Neural Information Processing Systems

We experimentally show higher accuracy in gradient estimation and demonstrate more stable and better-performing training in deep convolutional models with both proposed methods.


LOTION: Smoothing the Optimization Landscape for Quantized Training

arXiv.org Artificial Intelligence

Optimizing neural networks for quantized objectives is fundamentally challenging because the quantizer is piecewise constant, yielding zero gradients everywhere except at quantization thresholds, where the derivative is undefined. Most existing methods deal with this issue by relaxing gradient computations with techniques like Straight-Through Estimators (STE) and do not provide any guarantees of convergence. In this work, taking inspiration from Nesterov smoothing, we approximate the quantized loss surface with a continuous loss surface. In particular, we introduce LOTION (Low-precision Optimization via sTochastic-noIse smOothiNg), a principled smoothing framework that replaces the raw quantized loss with its expectation under unbiased randomized-rounding noise. In this framework, standard optimizers are guaranteed to converge to a local minimum of the loss surface. Moreover, when using noise derived from stochastic rounding, we show that the global minima of the original quantized loss are preserved. We empirically demonstrate that this method outperforms standard QAT on synthetic testbeds and on 150M- and 300M-parameter language models.
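The smoothing idea in this abstract can be illustrated with a small sketch. It is not the LOTION implementation: it assumes a single scalar weight, a quantization grid of the integers, and a hypothetical quadratic `toy_loss`; the names `stochastic_round` and `smoothed_loss` are made up for illustration. Under these assumptions, stochastic rounding is unbiased, and the expected quantized loss is a linear interpolation between the losses at the two neighbouring grid points, so it is continuous in the weight.

```python
import math
import random


def stochastic_round(x: float, rng: random.Random) -> float:
    """Round x to a neighbouring integer so that the expected result is x."""
    lower = math.floor(x)
    p_up = x - lower  # probability of rounding up
    return float(lower + 1) if rng.random() < p_up else float(lower)


def smoothed_loss(loss, w: float) -> float:
    """Exact E[loss(stochastic_round(w))] for a scalar w on the integer grid."""
    lower = math.floor(w)
    p_up = w - lower
    return (1.0 - p_up) * loss(float(lower)) + p_up * loss(float(lower + 1))


def toy_loss(q: float) -> float:
    """Hypothetical quadratic loss evaluated at a quantized weight q."""
    return (q - 1.3) ** 2


rng = random.Random(0)
samples = [stochastic_round(0.25, rng) for _ in range(20000)]
# The sample mean is close to 0.25 because the rounding noise is unbiased.
mean_rounded = sum(samples) / len(samples)
```

In this toy setting the smoothed loss equals the quantized loss at every grid point (for example, `smoothed_loss(toy_loss, 1.0) == toy_loss(1.0)`) and varies linearly in between, which is consistent with the abstract's claim that minima of the quantized loss are preserved under stochastic-rounding noise.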


Principled Approximation Methods for Efficient and Scalable Deep Learning

arXiv.org Artificial Intelligence

Recent progress in deep learning has been driven by increasingly large models. However, their computational and energy demands have grown proportionally, creating significant barriers to their deployment and to a wider adoption of deep learning technologies. This thesis investigates principled approximation methods for improving the efficiency of deep learning systems, with a particular focus on settings that involve discrete constraints and non-differentiability. We study three main approaches toward improved efficiency: architecture design, model compression, and optimization. For model compression, we propose novel approximations for pruning and quantization that frame the underlying discrete problem as continuous and differentiable, enabling gradient-based training of compression schemes alongside the model's parameters. These approximations allow for fine-grained sparsity and precision configurations, leading to highly compact models without significant fine-tuning. In the context of architecture design, we design an algorithm for neural architecture search that leverages parameter sharing across layers to efficiently explore implicitly recurrent architectures. Finally, we study adaptive optimization, revisiting theoretical properties of widely used methods and proposing an adaptive optimizer that allows for quick hyperparameter tuning. Our contributions center on tackling computationally hard problems via scalable and principled approximations. Experimental results on image classification, language modeling, and generative modeling tasks show that the proposed methods provide significant improvements in terms of training and inference efficiency while maintaining, or even improving, the model's performance.



Extending Straight-Through Estimation for Robust Neural Networks on Analog CIM Hardware

arXiv.org Artificial Intelligence

Analog Compute-In-Memory (CIM) architectures promise significant energy efficiency gains for neural network inference, but suffer from complex hardware-induced noise that poses major challenges for deployment. While noise-aware training methods have been proposed to address this issue, they typically rely on idealized and differentiable noise models that fail to capture the full complexity of analog CIM hardware variations. We provide theoretical analysis demonstrating that our approach preserves essential gradient directional information while maintaining computational tractability and optimization stability. Extensive experiments show that our extended STE framework achieves up to 5.3% accuracy improvement on image classification, 0.72 perplexity reduction on text generation, 2.2× speedup in training time, and 37.9% lower peak memory usage compared to standard noise-aware training methods. The exponential growth of neural network applications has intensified demand for energy-efficient computing solutions, particularly for edge devices with severe power and computational constraints [1], [2]. Analog Compute-In-Memory (CIM) architectures address these challenges by performing matrix-vector multiplications directly within memory arrays, eliminating energy-intensive data movement and achieving orders-of-magnitude energy efficiency improvements over traditional von Neumann architectures through analog weight storage and physical law-based computation [3], [4].
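The excerpt does not spell out how the STE framework is extended, so the sketch below shows only the basic straight-through idea it builds on: run the forward pass through a non-differentiable, noisy function, and in the backward pass treat that function as the identity. The hardware model (`hardware_forward`, uniform quantization plus Gaussian noise), the single-weight setup, and all parameter values are assumptions made for illustration, not the paper's noise model.

```python
import math
import random


def hardware_forward(z: float, rng: random.Random,
                     step: float = 0.25, sigma: float = 0.02) -> float:
    """Hypothetical non-differentiable hardware model: quantize, then add noise."""
    quantized = math.floor(z / step + 0.5) * step
    return quantized + rng.gauss(0.0, sigma)


def train_with_ste(x: float, target: float, steps: int = 200,
                   lr: float = 0.01, seed: int = 0) -> float:
    """Fit one weight w so that hardware_forward(w * x) is close to target."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        y = hardware_forward(w * x, rng)  # forward pass uses the noisy model
        grad_y = 2.0 * (y - target)       # d(loss)/dy for loss = (y - target)^2
        grad_w = grad_y * 1.0 * x         # STE: treat d(hardware)/dz as 1
        w -= lr * grad_w
    return w


w_final = train_with_ste(x=2.0, target=1.5)
```

The true derivative of the quantizer is zero almost everywhere, so plain gradient descent would never move `w` from its initial value; the straight-through substitution is what lets training make progress here. Note that `w` settles once the quantized output matches the target, which need not be the point where `w * x` equals the target exactly.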


Clarify Technical Contributions (R3 / R4): 2 Gradient Estimation

Neural Information Processing Systems

We thank all reviewers for their detailed constructive feedback and suggestions. Table B (below) demonstrates this empirically. Gumbel-Softmax has) with significantly less training time and resource consumption. These experiments show that when trained with Gumbel-CRF, the AR decoder outperforms REINFORCE. We will clarify this in the paper.



The Fourth State: Signed-Zero Ternary for Stable LLM Quantization (and More)

arXiv.org Artificial Intelligence

Quantization is typically viewed as a pragmatic trade-off between model fidelity and computational costs [2, 3, 13]. Aggressive 2-bit ternary-state schemes are now commonly used to allow large language models (LLMs) to run on commodity accelerators and edge devices. However, this leads to training-time issues resulting from intervals in which the quantizer output is numerically zero and the surrogate gradient vanishes. These near-zero intervals are referred to as "dead zones" [11]. We introduce Signed-Zero Ternary (SZT) quantization, in which we use the remaining fourth state in the 2-bit ternary encoding to distinguish two zero states (code words). This approach retains the benefits of ternary-state quantization while adding 1-bit gradient information at essentially no cost. This preserves the forward-path behavior of balanced ternary while the back-propagation rule remains fully deterministic for the straight-through form. We argue that availability of gradient information in this maximally quantized representation may tend to maximize overall information density rather than approximate it. All analytical results are obtained via changes to the encode/decode logic only, leaving the matrix-multiply datapath untouched, i.e., 1